Advances in variable selection methods II: Effect of variable selection method on classification of hydrologically similar watersheds in three Mid-Atlantic ecoregions
Abstract
ISSN 0022-1694. doi:10.1016/j.jhydrol.2012.01.035. Authors: H. Ssegane, Tollner, Y.M. Mo..., T.C. Rasmussen, J.F. Dowd.
Hydrological flow predictions in ungauged and sparsely gauged watersheds rely on regionalization, or on classification of hydrologically similar watersheds, to develop empirical relationships between hydrologic, climatic, and watershed variables. Watershed classifications may be based on geographic proximity, on regional frameworks such as ecoregions, or on cluster analysis of watershed descriptors. Watersheds are generally classified as hydrologically similar using either climatic and watershed variables or statistics of streamflow data. Using climatic and watershed descriptors requires variable selection to minimize redundancy in a large pool of candidate variables. This study compares the classification performance of four variable groups in identifying homogeneous watersheds in three Mid-Atlantic ecoregions (USA): Appalachian Plateau, Piedmont, and Ridge and Valley. The variable groups were: (1) variables that define watershed geographic proximity; (2) variables that define watershed hypsometry; (3) variables selected using causal selection algorithms; and (4) variables selected using principal component analysis (PCA) and stepwise regression. Using a similarity index (SI), the classification results were compared to reference watersheds classified as homogeneous on the basis of three streamflow indices: slope of the flow duration curve, baseflow index, and streamflow elasticity.
Classification performance was highest using variables selected by causal algorithms (e.g., HITON-MB method, SI = 0.71 for Appalachian Plateau, SI = 0.90 for Piedmont, and SI = 0.72 for Ridge and Valley) compared to variables selected by stepwise regression (SI = 0.72 for Appalachian Plateau, SI = 0.87 for Piedmont, and SI = 0.64 for Ridge and Valley) and PCA (SI = 0.71 for Appalachian Plateau, SI = 0.76 for Piedmont, and SI = 0.57 for Ridge and Valley). © 2012 Elsevier B.V. All rights reserved.
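The abstract scores each classification against a reference grouping with a similarity index (SI) but does not define it. The sketch below shows one plausible way to compare two groupings of watersheds: the mean, over reference groups, of the best Jaccard overlap with any candidate group. This measure, the function names, and the watershed IDs are all assumptions made for illustration, not the paper's method.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets of watershed IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity_index(reference_groups, candidate_groups):
    """Mean, over reference groups, of the best Jaccard match among candidate groups.

    Hypothetical stand-in for the paper's SI, which the abstract does not define.
    """
    if not reference_groups or not candidate_groups:
        raise ValueError("both classifications need at least one group")
    best = [
        max(jaccard(ref, cand) for cand in candidate_groups)
        for ref in reference_groups
    ]
    return sum(best) / len(best)


# Made-up watershed IDs, for illustration only.
reference = [{"w1", "w2", "w3"}, {"w4", "w5"}]  # e.g. grouped by streamflow indices
candidate = [{"w1", "w2"}, {"w3", "w4", "w5"}]  # e.g. grouped by selected variables
si = similarity_index(reference, candidate)
print(round(si, 2))
```

Under this assumed definition, identical groupings score 1.0 and groupings with no shared members score 0.0, so values such as those quoted above would read as partial agreement with the streamflow-based reference.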
Related articles
Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation
1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential for sediment production. Differing estimates of sediment amounts, along with the lack of long-term measurements, limit access to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...
A Comparison between New Estimation and variable Selection method in Regression models by Using Simulation
In this paper, some new methods that have very recently been introduced for parameter estimation and variable selection in regression models are reviewed. Furthermore, we simulate several models in order to evaluate the performance of these methods under different situations. Finally, we compare the performance of these methods with that of the regular traditional variable selection methods such ...
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
Finding stability regions for preserving efficiency classification of variable returns to scale technology in data envelopment analysis
This paper addresses the issue of sensitivity of the efficiency classification of variable returns to scale (VRS) technology, with the aim of enhancing the credibility of data envelopment analysis (DEA) results in practical applications when an additional decision making unit (DMU) needs to be added to the set being considered. It also develops a structured approach to assisting practitioners in making an appropria...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to many kinds of library resources. However, classifying documents within a large amount of data is still an issue, and finding particular documents demands time and energy. Classifying similar documents into specific classes can reduce the time needed to search for the required data, particularly text documents. This is further facilitated by using Artificial...